{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "VjoSNeI9RUKi"
      },
      "source": [
        "#**Advanced model training using hyperopt**\n",
        "\n",
        "In the Advanced Model Training tutorial we have already taken a look into hyperparameter optimasation using GridHyperparamOpt in the deepchem pacakge. In this tutorial, we will take a look into another hyperparameter tuning library called hyperopt.\n",
        "\n",
        "## Colab\n",
        "\n",
        "This tutorial and the rest in this sequence can be done in Google colab. If you'd like to open this notebook in colab, you can use the following link.\n",
        "\n",
        "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/deepchem/deepchem/blob/master/examples/tutorials/Hyperopt_training.ipynb)\n",
        "\n",
        "## Setup\n",
        "\n",
        "To run DeepChem and Hyperopt within Colab, you'll need to run the following installation commands. You can of course run this tutorial locally if you prefer. In that case, don't run these cells since they will download and install DeepChem and Hyperopt in your local machine again.\n",
        "\n",
        "\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 1,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 730
        },
        "id": "c70xCmMITvql",
        "outputId": "a0332832-4eba-499e-e02d-d7b56bcb17bb"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Collecting deepchem\n",
            "  Downloading deepchem-2.6.1-py3-none-any.whl (608 kB)\n",
            "\u001b[?25l\r\u001b[K     |▌                               | 10 kB 31.6 MB/s eta 0:00:01\r\u001b[K     |█                               | 20 kB 27.2 MB/s eta 0:00:01\r\u001b[K     |█▋                              | 30 kB 11.2 MB/s eta 0:00:01\r\u001b[K     |██▏                             | 40 kB 8.9 MB/s eta 0:00:01\r\u001b[K     |██▊                             | 51 kB 5.3 MB/s eta 0:00:01\r\u001b[K     |███▎                            | 61 kB 5.4 MB/s eta 0:00:01\r\u001b[K     |███▊                            | 71 kB 5.4 MB/s eta 0:00:01\r\u001b[K     |████▎                           | 81 kB 6.1 MB/s eta 0:00:01\r\u001b[K     |████▉                           | 92 kB 6.2 MB/s eta 0:00:01\r\u001b[K     |█████▍                          | 102 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |██████                          | 112 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |██████▌                         | 122 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |███████                         | 133 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |███████▌                        | 143 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |████████                        | 153 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |████████▋                       | 163 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |█████████▏                      | 174 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |█████████▊                      | 184 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |██████████▎                     | 194 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |██████████▊                     | 204 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |███████████▎                    | 215 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |███████████▉                    | 225 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |████████████▍                   | 235 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |█████████████                   | 245 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |█████████████▌                  | 256 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |██████████████                  | 266 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |██████████████▌                 | 276 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |███████████████                 | 286 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |███████████████▋                | 296 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |████████████████▏               | 307 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |████████████████▊               | 317 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |█████████████████▎              | 327 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |█████████████████▊              | 337 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |██████████████████▎             | 348 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |██████████████████▉             | 358 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |███████████████████▍            | 368 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |████████████████████            | 378 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |████████████████████▌           | 389 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |█████████████████████           | 399 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |█████████████████████▌          | 409 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |██████████████████████          | 419 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |██████████████████████▋         | 430 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |███████████████████████▏        | 440 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |███████████████████████▊        | 450 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |████████████████████████▎       | 460 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |████████████████████████▉       | 471 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |█████████████████████████▎      | 481 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |█████████████████████████▉      | 491 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |██████████████████████████▍     | 501 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |███████████████████████████     | 512 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |███████████████████████████▌    | 522 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |████████████████████████████    | 532 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |████████████████████████████▌   | 542 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |█████████████████████████████   | 552 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |█████████████████████████████▋  | 563 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |██████████████████████████████▏ | 573 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |██████████████████████████████▊ | 583 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |███████████████████████████████▎| 593 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |███████████████████████████████▉| 604 kB 5.2 MB/s eta 0:00:01\r\u001b[K     |████████████████████████████████| 608 kB 5.2 MB/s \n",
            "\u001b[?25hRequirement already satisfied: scipy in /usr/local/lib/python3.7/dist-packages (from deepchem) (1.4.1)\n",
            "Collecting numpy>=1.21\n",
            "  Downloading numpy-1.21.5-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (15.7 MB)\n",
            "\u001b[K     |████████████████████████████████| 15.7 MB 25.3 MB/s \n",
            "\u001b[?25hRequirement already satisfied: scikit-learn in /usr/local/lib/python3.7/dist-packages (from deepchem) (1.0.2)\n",
            "Requirement already satisfied: pandas in /usr/local/lib/python3.7/dist-packages (from deepchem) (1.3.5)\n",
            "Collecting rdkit-pypi\n",
            "  Downloading rdkit_pypi-2021.9.4-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (20.6 MB)\n",
            "\u001b[K     |████████████████████████████████| 20.6 MB 1.4 MB/s \n",
            "\u001b[?25hRequirement already satisfied: joblib in /usr/local/lib/python3.7/dist-packages (from deepchem) (1.1.0)\n",
            "Requirement already satisfied: pytz>=2017.3 in /usr/local/lib/python3.7/dist-packages (from pandas->deepchem) (2018.9)\n",
            "Requirement already satisfied: python-dateutil>=2.7.3 in /usr/local/lib/python3.7/dist-packages (from pandas->deepchem) (2.8.2)\n",
            "Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.7/dist-packages (from python-dateutil>=2.7.3->pandas->deepchem) (1.15.0)\n",
            "Requirement already satisfied: Pillow in /usr/local/lib/python3.7/dist-packages (from rdkit-pypi->deepchem) (7.1.2)\n",
            "Requirement already satisfied: threadpoolctl>=2.0.0 in /usr/local/lib/python3.7/dist-packages (from scikit-learn->deepchem) (3.1.0)\n",
            "Installing collected packages: numpy, rdkit-pypi, deepchem\n",
            "  Attempting uninstall: numpy\n",
            "    Found existing installation: numpy 1.19.5\n",
            "    Uninstalling numpy-1.19.5:\n",
            "      Successfully uninstalled numpy-1.19.5\n",
            "\u001b[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.\n",
            "yellowbrick 1.3.post1 requires numpy<1.20,>=1.16.0, but you have numpy 1.21.5 which is incompatible.\n",
            "datascience 0.10.6 requires folium==0.2.1, but you have folium 0.8.3 which is incompatible.\n",
            "albumentations 0.1.12 requires imgaug<0.2.7,>=0.2.5, but you have imgaug 0.2.9 which is incompatible.\u001b[0m\n",
            "Successfully installed deepchem-2.6.1 numpy-1.21.5 rdkit-pypi-2021.9.4\n"
          ]
        },
        {
          "data": {
            "application/vnd.colab-display-data+json": {
              "pip_warning": {
                "packages": [
                  "numpy"
                ]
              }
            }
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Requirement already satisfied: hyperopt in /usr/local/lib/python3.7/dist-packages (0.1.2)\n",
            "Requirement already satisfied: networkx in /usr/local/lib/python3.7/dist-packages (from hyperopt) (2.6.3)\n",
            "Requirement already satisfied: future in /usr/local/lib/python3.7/dist-packages (from hyperopt) (0.16.0)\n",
            "Requirement already satisfied: pymongo in /usr/local/lib/python3.7/dist-packages (from hyperopt) (4.0.1)\n",
            "Requirement already satisfied: scipy in /usr/local/lib/python3.7/dist-packages (from hyperopt) (1.4.1)\n",
            "Requirement already satisfied: numpy in /usr/local/lib/python3.7/dist-packages (from hyperopt) (1.21.5)\n",
            "Requirement already satisfied: tqdm in /usr/local/lib/python3.7/dist-packages (from hyperopt) (4.62.3)\n",
            "Requirement already satisfied: six in /usr/local/lib/python3.7/dist-packages (from hyperopt) (1.15.0)\n"
          ]
        }
      ],
      "source": [
        "!pip install deepchem\n",
        "!pip install hyperopt"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ZRpJ-BvHRTIE"
      },
      "source": [
        "## Hyperparameter Optimization via hyperopt\n",
        "\n",
        "Let's start by loading the HIV dataset.  It classifies over 40,000 molecules based on whether they inhibit HIV replication.\n",
        "\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 2,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "e0aDZ3aY6dkk",
        "outputId": "22e75f47-5d63-455c-b4da-47b420f204f9"
      },
      "outputs": [
        {
          "name": "stderr",
          "output_type": "stream",
          "text": [
            "'split' is deprecated.  Use 'splitter' instead.\n"
          ]
        }
      ],
      "source": [
        "import deepchem as dc\n",
        "tasks, datasets, transformers = dc.molnet.load_hiv(featurizer='ECFP', split='scaffold')\n",
        "train_dataset, valid_dataset, test_dataset = datasets\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "vnr75060q1j9"
      },
      "source": [
        "Now, lets import the hyperopt library, which we will be using to fund the best parameters"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 3,
      "metadata": {
        "id": "xM7G0LS1q1V_"
      },
      "outputs": [],
      "source": [
        "from hyperopt import hp, fmin, tpe, Trials"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ztSRJo3krDUm"
      },
      "source": [
        "Then we have to declare a dictionary with all the hyperparameters and their range that you will be tuning them in. This dictionary will serve as the search space for the hyperopt. \n",
        "Some basic ways of declaring the ranges in the dictionary are:\n",
        "\n",
        "\n",
        "\n",
        "*   hp.choice('label',[*choices*]) : this is used to specify a list of choices\n",
        "*   hp.uniform('label' ,low=*low_value* ,high=*high_value*) :  this is used to specify a uniform distibution between the low and high values. The values between them can be any real number, not necessaarily an integer.\n",
        "\n",
        "Here, we are going to use a multitaskclassifier to classify the HIV dataset and hence the appropriate search space is as follows.\n",
        "\n",
        "\n",
        "\n",
        "\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "vGeRn7oWUWo_"
      },
      "outputs": [],
      "source": [
        "search_space = {\n",
        "    'layer_sizes': hp.choice('layer_sizes',[[500], [1000], [2000],[1000,1000]]),\n",
        "    'dropouts': hp.uniform('dropout',low=0.2, high=0.5),\n",
        "    'learning_rate': hp.uniform('learning_rate',high=0.001, low=0.0001)\n",
        "}"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "HJzohvgFs70N"
      },
      "source": [
        "We should then declare a function to be minimized by the hyperopt. So, here we should use the function to minimize our multitaskclassifier model. Additionally, we are using a validation callback to validate the classifier for every 1000 steps, then we are passing the best score as the return. The metric used here is 'roc_auc_score', which needs to be maximized. To maximize a non-negative value is equivalent to minimize its opposite number, hence we are returning the negative of the validation score.\n",
        "\n",
        "\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "iEFj6HuetG1P"
      },
      "outputs": [],
      "source": [
        "import tempfile\n",
        "#tempfile is used to save the best checkpoint later in the program.\n",
        "\n",
        "metric = dc.metrics.Metric(dc.metrics.roc_auc_score)\n",
        "\n",
        "def fm(args):\n",
        "  save_dir = tempfile.mkdtemp()\n",
        "  model = dc.models.MultitaskClassifier(n_tasks=len(tasks),n_features=1024,layer_sizes=args['layer_sizes'],dropouts=args['dropouts'],learning_rate=args['learning_rate'])\n",
        "  #validation callback that saves the best checkpoint, i.e the one with the maximum score.\n",
        "  validation=dc.models.ValidationCallback(valid_dataset, 1000, [metric],save_dir=save_dir,transformers=transformers,save_on_minimum=False)\n",
        "  \n",
        "  model.fit(train_dataset, nb_epoch=25,callbacks=validation)\n",
        "\n",
        "  #restoring the best checkpoint and passing the negative of its validation score to be minimized.\n",
        "  model.restore(model_dir=save_dir)\n",
        "  valid_score = model.evaluate(valid_dataset, [metric], transformers)\n",
        "\n",
        "  return -1*valid_score['roc_auc_score']"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "McC5wuJax6IR"
      },
      "source": [
        "Here, we are calling the fmin function of the hyperopt, where we pass on the function to be minimized, the algorithm to be followed, max number of evals and a trials object. The Trials object is used to keep All hyperparameters, loss, and other information, this means you can access them after running optimization. Also, trials can help you to save important information and later load and then resume the optimization process.\n",
        "\n",
        "Moreover, for the algorithm there are three choice which can be used without any additional configuration. they are :-  \n",
        "\n",
        "\n",
        "*   Random Search - rand.suggest\n",
        "*   TPE (Tree Parzen Estimators) - tpe.suggest\n",
        "*   Adaptive TPE - atpe.suggest\n",
        "\n",
        "\n",
        "\n",
        "\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "e1i3yc7ECWVq",
        "outputId": "15c74c9e-8a15-49eb-bf97-a20cc9523043"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "\r  0%|          | 0/15 [00:00<?, ?it/s, best loss: ?]Step 1000 validation: roc_auc_score=0.777648\n",
            "Step 2000 validation: roc_auc_score=0.755485\n",
            "Step 3000 validation: roc_auc_score=0.739519\n",
            "Step 4000 validation: roc_auc_score=0.764756\n",
            "Step 5000 validation: roc_auc_score=0.757006\n",
            "Step 6000 validation: roc_auc_score=0.752609\n",
            "Step 7000 validation: roc_auc_score=0.763002\n",
            "Step 8000 validation: roc_auc_score=0.749202\n",
            "  7%|▋         | 1/15 [05:37<1:18:46, 337.58s/it, best loss: -0.7776476459925534]Step 1000 validation: roc_auc_score=0.750455\n",
            "Step 2000 validation: roc_auc_score=0.783594\n",
            "Step 3000 validation: roc_auc_score=0.775872\n",
            "Step 4000 validation: roc_auc_score=0.768825\n",
            "Step 5000 validation: roc_auc_score=0.769555\n",
            "Step 6000 validation: roc_auc_score=0.765324\n",
            "Step 7000 validation: roc_auc_score=0.771146\n",
            "Step 8000 validation: roc_auc_score=0.760138\n",
            " 13%|█▎        | 2/15 [07:05<41:16, 190.51s/it, best loss: -0.7835939030962179]  Step 1000 validation: roc_auc_score=0.744178\n",
            "Step 2000 validation: roc_auc_score=0.765406\n",
            "Step 3000 validation: roc_auc_score=0.76532\n",
            "Step 4000 validation: roc_auc_score=0.769255\n",
            "Step 5000 validation: roc_auc_score=0.77029\n",
            "Step 6000 validation: roc_auc_score=0.768024\n",
            "Step 7000 validation: roc_auc_score=0.764157\n",
            "Step 8000 validation: roc_auc_score=0.756805\n",
            " 20%|██        | 3/15 [09:40<34:53, 174.42s/it, best loss: -0.7835939030962179]Step 1000 validation: roc_auc_score=0.714572\n",
            "Step 2000 validation: roc_auc_score=0.770712\n",
            "Step 3000 validation: roc_auc_score=0.777914\n",
            "Step 4000 validation: roc_auc_score=0.76923\n",
            "Step 5000 validation: roc_auc_score=0.774823\n",
            "Step 6000 validation: roc_auc_score=0.775927\n",
            "Step 7000 validation: roc_auc_score=0.777054\n",
            "Step 8000 validation: roc_auc_score=0.778508\n",
            " 27%|██▋       | 4/15 [12:12<30:22, 165.66s/it, best loss: -0.7835939030962179]Step 1000 validation: roc_auc_score=0.743939\n",
            "Step 2000 validation: roc_auc_score=0.759478\n",
            "Step 3000 validation: roc_auc_score=0.738839\n",
            "Step 4000 validation: roc_auc_score=0.751084\n",
            "Step 5000 validation: roc_auc_score=0.740504\n",
            "Step 6000 validation: roc_auc_score=0.753612\n",
            "Step 7000 validation: roc_auc_score=0.71802\n",
            "Step 8000 validation: roc_auc_score=0.761025\n",
            " 33%|███▎      | 5/15 [17:40<37:21, 224.16s/it, best loss: -0.7835939030962179]Step 1000 validation: roc_auc_score=0.74099\n",
            "Step 2000 validation: roc_auc_score=0.767516\n",
            "Step 3000 validation: roc_auc_score=0.767338\n",
            "Step 4000 validation: roc_auc_score=0.775691\n",
            "Step 5000 validation: roc_auc_score=0.768731\n",
            "Step 6000 validation: roc_auc_score=0.755029\n",
            "Step 7000 validation: roc_auc_score=0.767115\n",
            "Step 8000 validation: roc_auc_score=0.764744\n",
            " 40%|████      | 6/15 [22:48<37:54, 252.71s/it, best loss: -0.7835939030962179]Step 1000 validation: roc_auc_score=0.713761\n",
            "Step 2000 validation: roc_auc_score=0.759518\n",
            "Step 3000 validation: roc_auc_score=0.765853\n",
            "Step 4000 validation: roc_auc_score=0.771976\n",
            "Step 5000 validation: roc_auc_score=0.772762\n",
            "Step 6000 validation: roc_auc_score=0.773206\n",
            "Step 7000 validation: roc_auc_score=0.775565\n",
            "Step 8000 validation: roc_auc_score=0.768521\n",
            " 47%|████▋     | 7/15 [27:53<35:58, 269.84s/it, best loss: -0.7835939030962179]Step 1000 validation: roc_auc_score=0.717178\n",
            "Step 2000 validation: roc_auc_score=0.754258\n",
            "Step 3000 validation: roc_auc_score=0.767905\n",
            "Step 4000 validation: roc_auc_score=0.762917\n",
            "Step 5000 validation: roc_auc_score=0.766162\n",
            "Step 6000 validation: roc_auc_score=0.767581\n",
            "Step 7000 validation: roc_auc_score=0.770746\n",
            "Step 8000 validation: roc_auc_score=0.77597\n",
            " 53%|█████▎    | 8/15 [30:36<27:29, 235.64s/it, best loss: -0.7835939030962179]Step 1000 validation: roc_auc_score=0.74314\n",
            "Step 2000 validation: roc_auc_score=0.757408\n",
            "Step 3000 validation: roc_auc_score=0.76668\n",
            "Step 4000 validation: roc_auc_score=0.768104\n",
            "Step 5000 validation: roc_auc_score=0.746377\n",
            "Step 6000 validation: roc_auc_score=0.745282\n",
            "Step 7000 validation: roc_auc_score=0.74113\n",
            "Step 8000 validation: roc_auc_score=0.734482\n",
            " 60%|██████    | 9/15 [36:53<28:00, 280.04s/it, best loss: -0.7835939030962179]Step 1000 validation: roc_auc_score=0.743204\n",
            "Step 2000 validation: roc_auc_score=0.76912\n",
            "Step 3000 validation: roc_auc_score=0.769981\n",
            "Step 4000 validation: roc_auc_score=0.784163\n",
            "Step 5000 validation: roc_auc_score=0.77536\n",
            "Step 6000 validation: roc_auc_score=0.779237\n",
            "Step 7000 validation: roc_auc_score=0.782344\n",
            "Step 8000 validation: roc_auc_score=0.779085\n",
            " 67%|██████▋   | 10/15 [38:23<18:26, 221.33s/it, best loss: -0.7841634210268469]Step 1000 validation: roc_auc_score=0.743565\n",
            "Step 2000 validation: roc_auc_score=0.765063\n",
            "Step 3000 validation: roc_auc_score=0.75284\n",
            "Step 4000 validation: roc_auc_score=0.759978\n",
            "Step 5000 validation: roc_auc_score=0.74255\n",
            "Step 6000 validation: roc_auc_score=0.721809\n",
            "Step 7000 validation: roc_auc_score=0.729863\n",
            "Step 8000 validation: roc_auc_score=0.73075\n",
            " 73%|███████▎  | 11/15 [44:07<17:15, 258.91s/it, best loss: -0.7841634210268469]Step 1000 validation: roc_auc_score=0.695949\n",
            "Step 2000 validation: roc_auc_score=0.765082\n",
            "Step 3000 validation: roc_auc_score=0.756256\n",
            "Step 4000 validation: roc_auc_score=0.771923\n",
            "Step 5000 validation: roc_auc_score=0.758841\n",
            "Step 6000 validation: roc_auc_score=0.759393\n",
            "Step 7000 validation: roc_auc_score=0.765971\n",
            "Step 8000 validation: roc_auc_score=0.747064\n",
            " 80%|████████  | 12/15 [48:54<13:21, 267.23s/it, best loss: -0.7841634210268469]Step 1000 validation: roc_auc_score=0.757871\n",
            "Step 2000 validation: roc_auc_score=0.765296\n",
            "Step 3000 validation: roc_auc_score=0.769748\n",
            "Step 4000 validation: roc_auc_score=0.776487\n",
            "Step 5000 validation: roc_auc_score=0.775009\n",
            "Step 6000 validation: roc_auc_score=0.779539\n",
            "Step 7000 validation: roc_auc_score=0.763165\n",
            "Step 8000 validation: roc_auc_score=0.772093\n",
            " 87%|████████▋ | 13/15 [50:22<07:06, 213.15s/it, best loss: -0.7841634210268469]Step 1000 validation: roc_auc_score=0.720166\n",
            "Step 2000 validation: roc_auc_score=0.768489\n",
            "Step 3000 validation: roc_auc_score=0.782853\n",
            "Step 4000 validation: roc_auc_score=0.785556\n",
            "Step 5000 validation: roc_auc_score=0.78583\n",
            "Step 6000 validation: roc_auc_score=0.786569\n",
            "Step 7000 validation: roc_auc_score=0.779249\n",
            "Step 8000 validation: roc_auc_score=0.783423\n",
            " 93%|█████████▎| 14/15 [51:52<02:55, 175.93s/it, best loss: -0.7865693280913189]Step 1000 validation: roc_auc_score=0.743232\n",
            "Step 2000 validation: roc_auc_score=0.762007\n",
            "Step 3000 validation: roc_auc_score=0.771809\n",
            "Step 4000 validation: roc_auc_score=0.755023\n",
            "Step 5000 validation: roc_auc_score=0.769812\n",
            "Step 6000 validation: roc_auc_score=0.769867\n",
            "Step 7000 validation: roc_auc_score=0.777354\n",
            "Step 8000 validation: roc_auc_score=0.775313\n",
            "100%|██████████| 15/15 [56:47<00:00, 227.13s/it, best loss: -0.7865693280913189]\n"
          ]
        }
      ],
      "source": [
        "trials=Trials()\n",
        "best = fmin(fm,\n",
        "    \t\tspace= search_space,\n",
        "    \t\talgo=tpe.suggest,\n",
        "    \t\tmax_evals=15,\n",
        "    \t\ttrials = trials)\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "5I9CzaylyjUw"
      },
      "source": [
        "The code below is used to print the best hyperparameters found by the hyperopt."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "aPeJaH89GD67",
        "outputId": "1e87f5a0-edbb-45ce-b407-66e7e37059dc"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Best: {'dropout': 0.3749846096922802, 'layer_sizes': 0, 'learning_rate': 0.0007544819475363869}\n"
          ]
        }
      ],
      "source": [
        "print(\"Best: {}\".format(best))\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "VD1ET6q6_naf"
      },
      "source": [
        "The hyperparameter found here may not be necessarily the best one, but gives a general idea on which parameters are effective. To get mroe accurate results, one has to increase the number of validation epochs and the epochs the model fit. But doing so may increase the time in finding the best hyperparameters."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Ng2npVmkyvpQ"
      },
      "source": [
        "# Congratulations! Time to join the Community!\n",
        "\n",
        "Congratulations on completing this tutorial notebook! If you enjoyed working through the tutorial, and want to continue working with DeepChem, we encourage you to finish the rest of the tutorials in this series. You can also help the DeepChem community in the following ways:\n",
        "\n",
        "## Star DeepChem on [GitHub](https://github.com/deepchem/deepchem)\n",
        "This helps build awareness of the DeepChem project and the tools for open source drug discovery that we're trying to build.\n",
        "\n",
        "## Join the DeepChem Discord\n",
        "The DeepChem [Discord](https://discord.gg/cGzwCdrUqS) hosts a number of scientists, developers, and enthusiasts interested in deep learning for the life sciences. Join the conversation!"
      ]
    }
  ],
  "metadata": {
    "colab": {
      "collapsed_sections": [],
      "name": "Advanced model training using hyperopt.ipynb",
      "provenance": []
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    },
    "language_info": {
      "name": "python"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}
